This is a walkthrough demonstrating how to generate SWNE plots alongside the Seurat manifold alignment pipeline from three pancreas datasets generated using different single cell RNA-seq technologies.
To save time we will be using the pre-computed Seurat object pancreas_integrated_seurat.Robj, which can be downloaded here.
First let’s load the required libraries
library(Seurat)
library(swne)
Let’s load the Seurat object
se.obj <- readRDS("~/swne/Data/multiple_pancreas_seurat_v3.RObj")
We can extract the integrated expression matrix
aligned.counts <- as.matrix(GetAssayData(se.obj, assay = "integrated"))
Note that this matrix has negative values, which we will have to rescale in order to run NMF
dim(aligned.counts)
## [1] 2000 5683
hist(as.matrix(aligned.counts))
aligned.counts <- t(apply(aligned.counts, 1, function(x) (x - min(x))/(max(x) - min(x))))
Once we have the “reconstructed matrix”, we can run NMF and embed the components
k <- 20
n.cores <- 8
nmf.res <- RunNMF(aligned.counts, k = k, alpha = 0, init = "ica", n.cores = n.cores)
pc.emb <- Embeddings(se.obj, "pca")
snn <- CalcSNN(t(pc.emb), k = 20, prune.SNN = 0)
swne.embedding <- EmbedSWNE(nmf.res$H, snn, alpha.exp = 1.5, snn.exp = 1,
n_pull = 3)
## Initial stress : 0.29636
## stress after 10 iters: 0.16434, magic = 0.004
## stress after 20 iters: 0.14331, magic = 0.043
## stress after 30 iters: 0.09593, magic = 0.500
## stress after 40 iters: 0.09127, magic = 0.500
## stress after 50 iters: 0.09041, magic = 0.500
## stress after 60 iters: 0.09006, magic = 0.500
## stress after 70 iters: 0.08971, magic = 0.500
## stress after 80 iters: 0.08968, magic = 0.500
We project the full gene expression matrix to get gene loadings for the full set of genes
norm.counts <- ExtractNormCounts(se.obj, rescale = T, rescale.method = "log", batch = NULL)
## calculating variance fit ... using gam
nmf.res$W <- ProjectFeatures(norm.counts, nmf.res$H, n.cores = n.cores)
We use these gene loadings to identify key genes to embed, and we hide all the factors
gene.factor.df <- SummarizeAssocFeatures(nmf.res$W, features.return = 1)
genes.embed <- unique(gene.factor.df$feature)
swne.embedding <- EmbedFeatures(swne.embedding, nmf.res$W, genes.embed, n_pull = 3)
swne.embedding$H.coords$name <- ""
We pull out the clusters and batches
clusters <- se.obj$celltype
batch <- se.obj$tech
We can then create the SWNE plot
PlotSWNE(swne.embedding, alpha.plot = 0.4, sample.groups = clusters, do.label = T,
label.size = 3.5, pt.size = 1.25, show.legend = F, seed = 42)
We also can show that there are no batch effects
PlotSWNE(swne.embedding, alpha.plot = 0.4, sample.groups = batch, do.label = F,
label.size = 3.5, pt.size = 0.25, show.legend = T, seed = 42)
UMAP plot for comparison
tsne.emb <- Embeddings(se.obj, "umap")
PlotDims(tsne.emb, sample.groups = clusters, show.legend = F, seed = 42,
show.axes = F)